The American Journal of Pathology
Elsevier BV
Preprints posted in the last 30 days, ranked by how well they match The American Journal of Pathology's content profile, based on 11 papers previously published here. The average preprint has a 0.05% match score for this journal, so anything above that is already an above-average fit.
Ayad, M. A.; McCortney, K.; Congivaram, H. T. S.; Hjerthen, M. G.; Steffens, A.; Zhang, H.; Youngblood, M. W.; Heimberger, A. B.; Chandler, J. P.; Jamshidi, P.; Ahrendsen, J. T.; Magill, S. T.; Raleigh, D. R.; Horbinski, C. M.; Cooper, L. A. D.
Meningiomas are the most common primary brain tumors and, despite their benign reputation, often behave aggressively. Meningiomas are morphologically heterogeneous, yet the full significance of their histologic diversity is unclear. This is in large part because many features are not readily quantifiable by traditional observer-based light microscopy. Molecular testing improves prognostic stratification, but is not universally accessible. We therefore sought to determine whether an artificial intelligence (AI)-trained program could predict specific genomic and epigenomic patterns in meningiomas, and whether it could extract more prognostic information out of standard hematoxylin and eosin (H&E) histopathology than the current WHO classification. To do this, we developed Morphologic Set Enrichment (MSE), an interpretable computational pathology framework that quantifies statistical enrichment of morphologic patterns, cells, and tissue architecture from H&E whole-slide images. The MSE meningioma histology program was able to accurately predict DNA methylation subtypes and concurrent chromosome 1p/22q losses, in the process identifying specific morphologic patterns associated with key genomic and epigenomic alterations. It also added prognostic value independent of standard clinical and pathological variables. These results demonstrate that AI-based quantitative morphologic profiling can capture clinically and biologically relevant information that redefines risk stratification for meningiomas, incorporating histological information not included in existing grading schemes.
Kaistha, A.; Situ, J. J.; Evans, S. C.; Ashton-Key, M.; Ogg, G.; Soilleux, E. J.
T-cell lymphomas are often histologically indistinguishable from benign T-cell infiltrates, so clonality testing is frequently required for diagnosis. However, clonality testing lacks spatial context and is slow and expensive, relying on complex multiplexed PCR reactions interpreted by experienced scientists or pathologists. We previously published details of a pair of highly specific monoclonal antibodies against the two alternatively used, but very similar, T-cell receptor β constant regions, TCRβ1 and TCRβ2. We demonstrated the feasibility of immunohistochemical detection of TCRβ1 and TCRβ2 in formalin-fixed, paraffin-embedded (FFPE) tissue as a novel diagnostic strategy for T-cell lymphomas. Here we validate an improved pairing of TCRβ1/2 rabbit monoclonal antibodies and demonstrate their utility for single and double immunostaining, including with a chimeric mouse anti-TCRβ2 antibody. Finally, we show that this staining is amenable to automated cell counting, permitting accurate calculation of the TCRβ2:TCRβ1 ratio.
Spirgath, K.; Huang, B.; Safraou, Y.; Kraftberger, M.; Dahami, M.; Kiehl, R.; Stockburger, C. H. F.; Bayerl, C.; Ludwig, J.; Jaitner, N.; Kühl, A.; Asbach, P.; Geisel, D.; Hillebrandt, K. H.; Wells, R. G.; Sack, I.; Tzschätzsch, H.
Background & Aims: The increasing global prevalence of metabolic dysfunction-associated steatotic liver disease (MASLD), including metabolic dysfunction-associated steatohepatitis (MASH), creates an urgent need for objective methods of histopathological assessment. Conventional histological approaches are time-consuming and rely on the interpreter's experience. Therefore, the results may suffer from high variability and offer only coarse categorisation. In this study, we propose a fully automated, deep-learning-based pipeline for the segmentation and characterisation of histological liver features for MASH/MASLD assessment. Methods: Segmentation was applied to H&E sections from 45 mice and 44 humans with MASH/MASLD. The method, which we named qHisto (quantitative histology), utilises the nnU-Net framework and quantifies key histological components of the MASH score, including macro- and microvesicular steatosis, fibrosis, inflammation, hepatocellular ballooning and glycogenated nuclei. Additionally, we characterised the tissue using novel features that are inaccessible through manual histology, such as the distribution of fat droplet sizes, the aspect ratio of nuclei and heatmaps. Results: qHisto parameters showed strong positive correlations with conventional histology scores (fat area R=0.91, inflammation density R=0.7, ballooning density R=0.49) and also with quantitative magnetic resonance imaging (fat area vs. hepatic fat fraction R=0.87). Our novel scores showed that deformation of nuclei is driven by large fat droplets rather than the overall amount of fat. Conclusions: A key advantage of our method is spatially resolved, precise histological quantification. These features provide a more finely resolved assessment of disease severity than conventional categorical scoring.
By automating time-consuming and repetitive readouts, qHisto improves standardisation and reproducibility of MASH/MASLD feature quantification and provides scalable, slide-wide readouts that can support histopathologists and enhance clinical assessment and therapeutic development. Impact and Implications: The proposed method provides an objective, automatic tool for comprehensive histological liver analysis of MASH/MASLD, which can be extended to other diseases and organs. By offering classic and novel quantitative parameters and scores, our method could support histologists in their daily routines and provide researchers with further insight into steatotic liver diseases.
Shimizu, A.; Imamura, K.; Yoshimura, K.; Atsushi, T.; Sato, M.; Harada, K.
Drug-induced liver injury (DILI) is an acute inflammatory liver disease caused not only by prescription and over-the-counter medications but also by health foods and dietary supplements. Typically, DILI patients recover once the causative substance is identified and discontinued. In contrast, autoimmune hepatitis (AIH) results from the immune-mediated destruction of hepatocytes due to a breakdown of self-tolerance mechanisms. Patients presenting with acute-onset AIH often lack characteristic clinical features, such as autoantibodies, and require prompt steroid treatment to prevent progression to liver failure. Liver biopsy currently remains the gold standard to differentiate acute DILI from AIH; however, general pathologists face significant diagnostic challenges due to overlapping histopathological features. This study integrates pathology expertise with deep learning-based artificial intelligence (AI) to differentiate DILI from AIH using histopathological images. Our AI model demonstrates promising classification accuracy (Accuracy 74%, AUC 0.81). This paper presents a detailed pathological analysis alongside AI methods, discusses the current model performance and limitations, and proposes directions for future improvements.
Abolfathi, H.; Maranda-Robitaille, M.; Lamaze, F. C.; Kordahi, M.; Armero, V. S.; Orain, M.; Fiset, P. O.; Joubert, D.; Desmeules, P.; Gagne, A.; Yatabe, Y.; Bosse, Y.; Joubert, P.
Background: Histologic descriptors such as lymphovascular invasion (LVI), visceral pleural invasion (VPI), spread through air spaces (STAS), and histologic grade have each been associated with adverse outcomes in lung adenocarcinoma (LUAD). However, with the exception of VPI, these features are not formally incorporated into the TNM staging system. We evaluated the prognostic value and incremental contribution of these histologic descriptors within the framework of the 9th edition TNM staging system. Methods: In total, 1,745 individuals diagnosed with stage I-III invasive non-mucinous lung adenocarcinoma (NM-LUAD) were included in this study, comprising 1,139 French-Canadian patients who underwent surgical resection at IUCPQ-Universite Laval (discovery cohort) and 606 patients from the National Cancer Center Hospital in Tokyo, Japan (validation cohort). The objective of this study was to assess the prognostic contribution of histologic descriptors, including STAS and LVI, as complements to conventional 9th edition TNM staging. Results: Grade 3 tumors, LVI, and STAS were identified in 880 (50.4%), 809 (46.4%), and 775 (44.4%) of 1,745 cases, respectively. Histologic grade and LVI demonstrated the strongest prognostic associations, particularly in early-stage disease, while STAS exhibited a stage-dependent effect, being more impactful in stages II-III. VPI showed less consistent prognostic value. Incorporating these histologic descriptors into TNM staging improved prognostic model performance, with the largest gains driven by histologic grade and LVI, while STAS provided additional, complementary prognostic refinement. Conclusion: These findings demonstrate that key histologic descriptors, including histologic grade, LVI, and STAS, represent robust and reproducible prognostic parameters.
Importantly, these descriptors provide complementary, stage-dependent information that may enhance risk stratification and inform refinement of future TNM staging frameworks, including the forthcoming 10th edition.
Niggemeier, L.; Hoelscher, D. L.; Herkens, T. C.; Gilles, P.; Boor, P.; Buelow, R.
Introduction: Kidney biopsy reports contain rich information that is clinically actionable and useful for research. However, the narrative format hinders scalable reuse. Here we investigated whether open-source large language models (LLMs) can extract relevant, standardized readouts from native kidney biopsy pathology reports. Methods: German free-text native kidney biopsy reports were parsed with three open-source LLMs (Llama3 70B, Llama3 8B, MedGemma) to generate structured JSON outputs covering relevant report elements (e.g., diagnosis, glomerular counts, histopathological patterns). Two independent observers manually curated the same report elements; disagreements between the two were resolved by an experienced nephropathologist to create the final ground truth. Performance was assessed using strict and soft matching and summarized as accuracy. Inter-rater agreement was quantified using Cohen's and Light's kappa with 95% confidence intervals obtained from 1000 bootstrap resamples. Results: Llama3 70B achieved the highest overall accuracy (93.3% strict, 97.1% soft), followed by MedGemma. These larger models showed near-perfect performance for explicit and discrete variables and for positivity of immunohistochemistry markers, while accuracy decreased for report elements requiring interpretation (e.g., primary diagnosis, interstitial inflammation in fibrotic vs. non-fibrotic cortex). Human raters showed strong agreement for the primary diagnosis (κ = 0.74, 95% CI 0.64-0.84). Adding Llama3 70B or MedGemma as a third rater increased overall agreement (0.82, 95% CI 0.74-0.89 and 0.78, 95% CI 0.69-0.85, respectively), whereas Llama3 8B reduced it. Conclusions: Open-source LLMs can accurately transform narrative nephropathology reports into a structured and machine-readable format, potentially supporting scalable retrospective cohort building. While some report elements can be extracted without supervision, interpretation-dependent elements should be supervised by a human observer.
Lay Summary: Retrospective data collection from nephropathology reports is essential for building informative cohorts in computational nephropathology research, yet manual processing of narrative reports is time-consuming and limits scalability. In this study, we demonstrate that open-source large language models can reliably extract key diagnostic, quantitative, and descriptive data elements from kidney biopsy reports with high accuracy. While factual and clearly stated report elements can be extracted automatically, findings that require contextual or interpretative judgment still benefit from expert supervision. Overall, this approach substantially reduces manual effort and enables efficient generation of structured datasets from routine diagnostics, facilitating the development of kidney registries and future computational nephropathology research. In addition, such systems could be implemented in the routine diagnostic workflow to directly transform narrative reports into structured data.
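The agreement statistics named in this abstract can be illustrated with a minimal sketch. It assumes the textbook definitions (Cohen's kappa for two raters, Light's kappa as the mean of all pairwise Cohen's kappas, and a percentile bootstrap for the confidence interval); the labels below are invented for illustration and the code is not the authors' implementation.

```python
import random
from collections import Counter
from itertools import combinations

def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(r1)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

def lights_kappa(raters):
    """Light's kappa: mean of Cohen's kappa over all rater pairs."""
    pairs = list(combinations(raters, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)

def bootstrap_ci(r1, r2, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for Cohen's kappa, resampling items."""
    rng = random.Random(seed)
    n = len(r1)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(cohen_kappa([r1[i] for i in idx], [r2[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# Invented example labels (hypothetical primary diagnoses).
rater_a = ["IgAN", "MN", "IgAN", "FSGS", "MN", "IgAN", "FSGS", "MN"]
rater_b = ["IgAN", "MN", "FSGS", "FSGS", "MN", "IgAN", "FSGS", "IgAN"]
kappa = cohen_kappa(rater_a, rater_b)
ci_low, ci_high = bootstrap_ci(rater_a, rater_b)
```

Adding a model as a third rater, as the abstract describes, would correspond to calling `lights_kappa` on three label lists instead of two.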
Gernand, A. D.; Walker, R.; Pan, Y.; Mehta, M.; Sincerbeaux, G.; Gallagher, K.; Bebell, L. M.; Ngonzi, J.; Catov, J. M.; Skvarca, L. B.; Wang, J. Z.; Goldstein, J. A.
Background: Placental growth and function are imperative for healthy fetal growth; data on placentas can inform research and clinical care. Measuring placental size after delivery should be easy, but current methods are hard to standardize and error-prone. We developed PlacentaVision, which uses artificial intelligence (AI)-based models to automatically, accurately, and precisely measure placentas from digital photographs. Objective: We aimed to compare placental disc morphology between gross pathology examination (human measurements) and our automated PlacentaVision model (AI measurements). Methods: PlacentaVision is a multi-site study to assess placental morphology, features, and pathologies from digital photographs. We built a large dataset of digital placenta photographs and clinical data from singleton births at three large hospitals: Northwestern Memorial (Chicago; n=24,933), UPMC Magee-Womens (Pittsburgh; n=1,198) and Mbarara Regional Referral (Uganda; n=1,715). Data and images were from the medical record for Northwestern, part of a biobank study for Magee, and from our prospective studies for Mbarara. We compared long and short disc axis length (defined by Amsterdam criteria) between human and AI-based PlacentaVision measurements by calculating the difference and using Bland-Altman analysis; we stratified by site, disc shape, infant sex, and term/preterm birth. Results: Mean (SD) disc length was 19.2 (3.1) and 18.6 (3.1) cm from PlacentaVision and human measurement, respectively, with a difference of 0.57 (2.19) cm. Disc width was 16.3 (2.3) cm and 16.1 (2.4) cm from PlacentaVision and human measurement, respectively, with a difference of 0.25 (1.85) cm. Bland-Altman limits of agreement were -3.7 to 4.9 cm for length and -3.4 to 3.9 cm for width. Irregularly shaped placentas had a greater difference between PlacentaVision and human measurements than those with round/oval shapes (length differences of 1.53 and 0.45 cm, respectively).
Further, there were length differences by site (Northwestern 0.6 cm, Magee 0.0 cm, and Mbarara 0.4 cm) and gestational age at birth (preterm 0.71 cm, term 0.53 cm), but results were similar for male and female placentas. Results for width were similar to those for length. Conclusions: On average, AI-based measurements differed from human measurements by less than 1 cm. Our findings of larger differences for irregular shapes and preterm births may indicate that it is difficult for humans to measure irregular or small placentas according to protocol. PlacentaVision can automate and standardize the process.
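The Bland-Altman limits quoted above can be checked against the reported mean differences and SDs. The sketch below assumes the conventional limits of agreement, mean difference ± 1.96 × SD of the paired differences; the abstract does not state the multiplier, so 1.96 is an assumption, and the function names are illustrative only.

```python
from statistics import mean, stdev

def bland_altman(a, b, z=1.96):
    """Bias and limits of agreement for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return bias, bias - z * sd, bias + z * sd

def limits_from_summary(bias, sd, z=1.96):
    """Limits of agreement from an already-reported bias and SD."""
    return bias - z * sd, bias + z * sd

# Summary values taken from the abstract (cm).
length_limits = limits_from_summary(0.57, 2.19)
width_limits = limits_from_summary(0.25, 1.85)

# Small invented paired sample, only to show the raw-data path.
toy_bias, toy_low, toy_high = bland_altman([10, 12, 14], [9, 12, 13])
```

Under that assumption, the rounded results agree with the reported limits of -3.7 to 4.9 cm for length and -3.4 to 3.9 cm for width.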
Jeong, W. C.; Kim, H. H.; Hwang, Y.; Hwang, G.; Kim, K.; Ko, Y. S.
The Updated Sydney System (USS) provides a standardized framework for grading gastritis and stratifying gastric cancer risk. However, subjective observer variability and labor-intensive workflows impede its routine clinical use. To address these challenges, we developed SydneyMTL, a multi-task deep learning framework that uses Multiple Instance Learning (MIL) with task-specific attention pooling to predict severity grades across all five USS attributes simultaneously. Trained on an unprecedented cohort of 50,765 whole-slide images (WSIs), SydneyMTL generates interpretable histologic evidence for clinical practice. In retrospective evaluations against 24 board-certified pathologists, the model achieved an overall mean lenient accuracy of 89.1%, with 22 pathologists exhibiting >80% agreement with the model. When evaluated on an expert-adjudicated "Golden dataset," the model's performance improved to 90.2%, demonstrating its capacity to align with multi-expert consensus and filter individual annotator noise. Latent space analysis confirmed that SydneyMTL captures the ordinal structure of the USS by representing disease severity as a continuous biological spectrum rather than as disjoint categories. Finally, a randomized crossover reader study showed that AI-assisted review significantly reduced interpretation time and improved inter-observer agreement, establishing SydneyMTL as a scalable tool for supporting standardized gastric cancer risk stratification.
Highlights: SydneyMTL is the first unified framework to simultaneously predict the full 4-tier severity grades across all five Updated Sydney System attributes. Trained on a massive cohort of 50,765 whole-slide images, the model aligns with multi-expert consensus on a rigorous "Golden dataset". AI assistance significantly reduces pathologist reading time and harmonizes inter-observer variability in real-world clinical workflows. Latent space analysis confirms that SydneyMTL preserves the biological ordinality of disease severity without explicit ordinal constraints.
The bigger picture: Gastritis is among the most frequent diagnoses in gastrointestinal pathology, and its histologic severity is central to gastric cancer prevention. In routine practice, pathologists convert subtle mucosal changes into semi-quantitative, ordinal grades using the Updated Sydney System, which evaluates five co-existing histologic dimensions. While this framework provides a shared language, grading is labor-intensive and inherently dependent on reader-specific thresholds, creating variability that affects risk stratification and surveillance. A key concept motivating our study is that gastritis is not defined by a single finding but by multiple criteria that co-occur and interact. This suggests that computational models should learn these criteria jointly, capturing their biological correlations and the continuum of severity, rather than treating each grade as an isolated classification task. SydneyMTL implements this perspective through a unified multi-task, weakly supervised approach that learns directly from a massive cohort of 50,765 routine whole-slide images. Beyond diagnostic accuracy, our work reveals that the model preserves the ordinality of severity in its representation space, supporting the biological view that discrete clinical categories approximate an underlying continuous biological spectrum. Its attention-based explanations also connect model outputs to interpretable tissue evidence, enhancing clinical trust.
Crucially, by harmonizing inter-observer variability, SydneyMTL provides a more reliable foundation for gastric cancer risk assessment, ensuring that premalignant changes are captured with greater consistency. More broadly, our findings reposition AI for gastritis from narrow detection toward scalable, evidence-based decision support that can standardize grading practices and reduce cognitive burden on the global pathology workforce.
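The idea of task-specific attention pooling in multiple instance learning can be sketched in a few lines. This is a deliberately simplified illustration and not the SydneyMTL architecture: it assumes each slide is a bag of patch embeddings, scores patches with a plain dot product against a hypothetical per-task vector (published attention-MIL models typically use a small learned network instead), and all names and numbers are invented.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(patch_embeddings, task_vector):
    """Pool a bag of patch embeddings into one slide-level vector.

    Each patch is scored against a task-specific vector; the softmax of
    the scores gives attention weights, which can be inspected to see
    which patches drove the slide-level representation.
    """
    scores = [sum(w * x for w, x in zip(task_vector, emb))
              for emb in patch_embeddings]
    weights = softmax(scores)
    dim = len(patch_embeddings[0])
    pooled = [sum(wt * emb[d] for wt, emb in zip(weights, patch_embeddings))
              for d in range(dim)]
    return pooled, weights

# Two invented 2-D patch embeddings and two hypothetical task vectors.
patches = [[1.0, 0.0], [0.0, 1.0]]
task_vectors = {"task_a": [10.0, 0.0], "task_b": [0.0, 10.0]}
pooled_a, weights_a = attention_pool(patches, task_vectors["task_a"])
pooled_b, weights_b = attention_pool(patches, task_vectors["task_b"])
```

Because each task has its own scoring vector, the same bag of patches yields different attention weights per task, which is what lets one shared model highlight different tissue regions for different grading attributes.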
Jiang, B.; Zhang, Y.; Sheng, H.; Wang, Q.; Hu, B.; Wang, L.; Fu, J.
Objective: To explore the application value of dual-staining for special AT-rich sequence-binding protein 2 (SATB2) immunohistochemistry and elastic lamina in detecting elastic lamina invasion (ELI) in pT3 colon cancer, and to assess its association with clinicopathological characteristics, staging, and prognosis. Methods: This retrospective cohort study enrolled 176 pT3 colon cancer patients who underwent radical resection at Affiliated Jinhua Hospital, Zhejiang University School of Medicine. The paraffin blocks with the deepest tumor infiltration were collected for SATB2 immunohistochemistry and elastin dual-staining. Correlations between ELI status and clinicopathological characteristics and prognosis were analyzed. Survival data of 74 pT4a stage patients were collected for comparative analysis. Results: ELI (+) was positively associated with high tumor budding grade, vascular invasion, lymph node metastasis, and reduced tumor-infiltrating lymphocytes (TILs) (all P < 0.001). No correlations were observed with age, gender, tumor location, histological subtype, tumor grade, or perineural invasion (all P > 0.05). The ELI (+) group exhibited significantly shorter disease-free survival (DFS) and overall survival (OS) compared with the ELI (-) group (P < 0.05). Additionally, the ELI (+) group demonstrated inferior OS compared with the pT4a group, though DFS did not differ significantly. Conclusion: Dual-staining of SATB2 immunohistochemistry and elastic lamina provides a reproducible and objective method for assessing ELI. ELI correlates with key clinicopathological features and functions as an independent adverse prognostic indicator in pT3 colon cancer.
Just, M. K.; Christensen, K. B.; Wirenfeldt, M.; Steiniche, T.; Parkkinen, L.; Myllykangas, L.; Borghammer, P.
Objective: Brain banks preserve extensive material relevant to neurodegenerative disease research. As these collections age, tissue becomes archival, raising the question of whether long-term fixed and stored human brain tissue remains suitable for contemporary immunohistochemical analyses. Materials and Methods: Forty-one autopsy brains collected between 1946 and 1980 were examined. For each case, midbrain and hippocampus were available both as original paraffin-embedded blocks and as tissue stored long term in fixative. New paraffin blocks were prepared from the long-term fixed tissue. Sections from original and newly prepared blocks were immunohistochemically stained for α-synuclein, hyperphosphorylated tau and amyloid-β. Immunoreactivity was assessed using semi-quantitative scoring. Results: Original blocks consistently showed good staining intensity and morphological preservation for each protein pathology. Newly prepared blocks showed slightly lower semi-quantitative scores for Lewy-related pathology, without statistically significant differences, except for astrocytic α-synuclein in the substantia nigra in cases from the 1960s. Tau pathology displayed modestly reduced labelling, particularly of the neuropil threads and neurofibrillary tangles, most evident in cases from the 1950s. Amyloid-β-positive senile plaques showed similar or slightly higher scores in newly prepared blocks, with no significant differences across regions. Conclusion: Human brain tissue preserved as paraffin-embedded blocks or stored in fixative for up to 78 years remains suitable for immunohistochemical analyses. Adequate-to-good detection of aggregated α-synuclein, hyperphosphorylated tau and amyloid-β is achievable, indicating preserved pathological hallmarks of Lewy body disease and Alzheimer's disease in archival tissue.
Heysmond, S.; Kyratzi, P.; Wattis, J.; Paldi, A.; Brookes, K.; Kreft, K. L.; Shao, B.; Rauch, C.
Background: Quantitative genome-wide association studies (GWAS) primarily rely on additive linear models that compare average phenotypic differences between genotype groups. While effective for detecting common variants of moderate effect in large samples, such approaches inherently reduce high-resolution phenotypic data to summary statistics (group averages), potentially limiting the detection of subtle genotype-phenotype relationships. Genomic Informational Field Theory (GIFT) is a recently developed methodology that preserves the fine-grained informational structure of quantitative traits by analysing ranked phenotypic configurations rather than relying solely on mean differences. Methods: We applied GIFT to genetic and neuropathological data from the Brains for Dementia Research cohort, a well-characterised dataset of 563 individuals, and compared its performance with conventional GWAS. A principal component analysis (PCA)-derived matrix was used to derive independent quantitative traits linked to Alzheimer disease (AD) neuropathology measures (CERAD, Thal, Braak staging), with and without inclusion of age at death. The principal components were analysed using the GWAS and GIFT frameworks on the same filtered genotype dataset. Results: Both GWAS and GIFT identified genome-wide significant associations (p < 0.000001) within the APOE locus (NECTIN2/TOMM40/APOE/APOC1), demonstrating concordance with established AD genetic variants. However, GIFT detected 19 additional significant SNPs beyond those identified by GWAS. Variants associated with AD pathology implicated genes involved in amyloid processing, neuronal apoptosis, synaptic function, neuroinflammation, and metabolic regulation. Notably, GIFT identified 29 loci associated with age-at-death-related variation that were not detected by GWAS, highlighting genes linked to lipophagy, mitochondrial quality control, sphingolipid metabolism, frailty, and aging-related processes.
Conclusions: GIFT recapitulates canonical GWAS findings while uncovering additional biologically relevant associations. By preserving the fine-grained structure of phenotypic data distributions and detecting non-random genotype segregation across ranked trait values, GIFT enables the identification of associations that remain undetected by traditional average-based GWAS approaches. These results demonstrate that rethinking analytical representation, rather than solely increasing sample size, can expand the discovery potential of genetic association studies, offering a transparent and complementary framework for quantitative genomics in deeply phenotyped datasets.
Kibera, J.; Bender, J. B.; Kobia, F. M.; Kibaya, R.; Gitonga, M.; Gitonga, F.; Ondieki, F.; Killingo, B.; Kepha, S.; Achakolong, M.; Gelalcha, B.; Mahero, M.
Background: Hepatocellular carcinoma (HCC) is a leading cause of cancer-related death in sub-Saharan Africa (SSA). Differentiating primary HCC from metastatic liver tumors remains a significant diagnostic challenge. Understanding the prevalence and clinical predictors of HCC is crucial for improving diagnosis and patient care. This study examined the prevalence of hepatitis B virus (HBV), hepatitis C virus (HCV), and HCC, and clinical predictors of HCC. Methods: We used immunohistochemical markers on archived liver tumor biopsies and analyzed the data using descriptive and logistic regression analysis. Results: Among 58 liver carcinoma cases, 37.9% had HCC, and 62% had metastatic liver carcinoma (MLC). HCC was most common (61.5%) among middle-aged adults (50-59 years). HCC was more frequent in males (47.2%) than in females (22.7%). Over half of the patients (51.7%) tested positive for HBV. HCC was more prevalent in HBV-positive patients than in HBV-negative ones (43.3% vs 32.1%). Hepatic fibrosis was identified in 27.6% of cases. HCC was more common in patients with fibrosis (56.2%) than in those without (31%). HCV infection was rare (6.9%) in this study. In multivariable logistic regression analysis, none of the examined predictors reached statistical significance (P>0.05). Nevertheless, patients aged 50-59 years, males, and those with HBV infection or hepatic fibrosis showed higher odds of HCC. Hepatocyte Paraffin-1 (Hep Par-1) demonstrated 97% specificity and a 95% positive predictive value (PPV) for differentiating HCC from MLC. The combined marker pattern of Hep Par-1 positive and AE1/AE3 negative was highly predictive of HCC (100% specificity, 100% PPV, and 93.2% diagnostic accuracy). Conclusions: Our findings indicate that while the assessed risk factors tend to show the expected directional association with HCC, larger studies are needed to determine their independent effects. The combined Hep Par-1-positive/AE1/AE3-negative immunophenotype is more accurate than either marker alone.
Therefore, this combined test is a valuable diagnostic tool for confirming HCC in resource-limited settings.
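The marker performance figures in this abstract (specificity, PPV, diagnostic accuracy) all derive from a 2×2 table of marker result against final diagnosis. The sketch below uses the standard definitions; the counts are invented for illustration and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic metrics.

    tp: marker-positive HCC, fp: marker-positive non-HCC,
    fn: marker-negative HCC, tn: marker-negative non-HCC.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts: no false positives, some missed cases.
metrics = diagnostic_metrics(tp=40, fp=0, fn=10, tn=50)
```

With these invented counts, specificity and PPV are both 100% while accuracy is 90%, which shows how a marker pattern with no false positives can still have imperfect overall accuracy when some true cases are missed.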
Hoe, Z. Y.; Ding, R.-S.; Chou, C.-P.; Hu, C.; Lee, C.-H.; Tzeng, Y.-D.; Pan, C.-T.; Lee, M.-C.; Lee, E. K.-L.
Background: Breast cancer-related lymphedema (BCRL) is a common complication following breast cancer treatment. While lymphoscintigraphy is considered the diagnostic gold standard, it is unsuitable for routine periodic monitoring or assessment of treatment efficacy. Shear wave elastography (SWE) offers a possible alternative, but traditional modes of operation limit its potential. Proposed Solutions: The Holder-Optimized Elastography (HOE) method is introduced to eliminate pressure issues caused by manual operation of ultrasound probes by stabilizing them above the cutis. Methods: The HOE method was used to acquire ARFI images of high-velocity areas (HVAs, with shear wave velocity greater than 7 m/s) in limbs with and without BCRL (as confirmed and characterized by lymphoscintigraphy) in two cohorts of 15 and 125 patients. Results: The HOE method enabled ARFI elastography to directly and consistently visualize the effects caused by both obstructed lymphatic vessels and intraluminal lymphatic fluid as HVAs, whereas traditional hand-held methods did not. Inter-limb differences in HVA burden showed moderate diagnostic performance for detecting BCRL and grading obstruction, with modest sensitivity. However, there was systematic underestimation of both early and confluent advanced lesions. Conclusion: HOE-based HVA imaging has potential for rapid and non-invasive monitoring of lymphedema course and treatment response and may serve as a useful adjunct to existing diagnostic tools for BCRL. However, further technical refinements and quantitative analytic methods will be required to fully exploit the richer SWV information provided by HOE and to enhance the diagnostic utility of HVAs. Summary Statement: The Holder-Optimized Elastography ("HOE") method increases the diagnostic capability of ARFI elastography for breast cancer-related lymphedema, allowing for the non-invasive detection of some, but not all, lymphatic obstructions.
Key Results: The Holder-Optimized Elastography (HOE) method revealed the effects caused by fluid-filled lymphatic vessels as "High-Velocity Areas" (HVAs), which are difficult to detect by conventional methods. HVA counts for detecting lymphedema (any obstruction vs. no obstruction) showed high specificity (0.86-1.00) but low sensitivity (0.57-0.67). Conversely, HVA counts for staging lymphedema (i.e., total vs. partial obstruction) showed high sensitivity (up to 1.00) but low specificity (0.48-0.66). The inter-limb difference in HVAs counted in whole-limb scans between affected and unaffected limbs (the "Global Mean Difference") provided the most balanced diagnostic performance (sensitivity 0.67-0.79, specificity 0.88-0.89).
McNeil, M.; Ramanathan, V.; Bassiouny, D.; Nofech-Mozes, S.; Rakovitch, E.; Martel, A. L.
Background: Although ductal carcinoma in situ (DCIS) has a relatively low recurrence rate, many patients still receive adjuvant radiotherapy or endocrine therapy, raising concerns about overtreatment. Reliable biomarkers are therefore needed to predict an individual patient's risk and guide treatment decisions. Recent studies suggest that the composition of the tumour-associated stroma (TAS) affects progression and outcome, highlighting TAS-derived biomarkers as promising candidates for further investigation. Methods: We trained AI models for cell and tumour segmentation using whole-slide digital pathology images acquired as part of a retrospective cohort study. We investigated the effects of cell density within both the tumour and the TAS to determine how they correlated with recurrence in the ipsilateral breast. Results: We found that the concentration of DCIS lesions on the slide and the density of mitotic figures inside the TAS region were significantly associated with recurrence risk. Additionally, we found some predictive value in the lymphocyte and red blood cell densities in different tumour regions. Stromal composition was shown to be associated with recurrence risk, and density-based biomarkers were identified and used to cluster patients into phenotypes with significantly different risk profiles. Conclusion: Our findings highlight the prognostic relevance of stromal composition in DCIS, and we identify novel density-based biomarkers that can be used to identify patients who are more likely to experience a local recurrence after breast-conserving surgery alone. These results may aid in developing future risk-stratification tools for breast cancer patients, thereby reducing overtreatment and improving patient care.
Bolig, T. C.; Grudzinski, K.; Shawabkeh, M.; Selvan, K. C.; Goodwin, R. J.; Olson, E.; Bemiss, B. C.; Parekh, N.; Savas, H.; Dematte, J. E.; Esposito, A. J.
Objective: Myositis-associated interstitial lung disease (myositis-ILD) consists of two predominant radiologic patterns of lung injury, nonspecific interstitial pneumonia (NSIP) and organizing pneumonia (OP), that often coexist. However, it remains unclear whether either is associated with clinical outcomes. We aimed to assess the therapeutic response in patients with NSIP-predominant compared with those with OP-predominant myositis-ILD. Methods: This retrospective, single-center cohort study recruited participants from the Northwestern University ILD Registry with a circulating myositis-associated antibody, ILD, and at least 6 months of follow-up while on immunomodulatory therapy during a 24-month observation period after diagnosis. Two thoracic radiologists determined the predominant radiologic pattern (NSIP or OP). The primary outcome was the absolute change in forced vital capacity (FVC) at 24 months post-diagnosis. Secondary outcomes included changes in the diffusing capacity of the lung for carbon monoxide (DLCO) and radiologic qualitative and quantitative measures of lung injury. Results: Forty-one participants were included in analyses. 71% had an OP-predominant and 29% had an NSIP-predominant radiologic pattern of lung injury. Both exposure cohorts had improvement in mean absolute FVC (OP cohort = +0.18 L [p=0.005], NSIP cohort = +0.24 L [p=0.07]) over the 24-month observation period. The OP cohort (p<0.05) but not the NSIP cohort (p=0.20) had an increase in DLCO. The OP cohort demonstrated improvement in the qualitative assessment of follow-up imaging (p<0.05), driven by quantitative improvement in ground-glass/consolidative opacities (p=0.006). A subset of participants demonstrated features of NSIP/OP overlap and had greater baseline radiologic severity of lung injury. Conclusion: Patients with circulating myositis-associated antibodies and an OP-predominant pattern of lung injury may have a more favorable response to therapy than those with NSIP.
Further studies are needed to validate our findings and delineate other features cognate with these associations.
Significance and Innovations
- Radiologic phenotyping may predict therapeutic response in myositis-ILD. This study demonstrates that an OP-predominant computed tomography (CT) pattern of lung injury is associated with greater improvement in lung function and radiologic signs of inflammation over 24 months on at least 6 months of immunomodulatory therapy compared with an NSIP-predominant pattern, suggesting that CT pattern may provide clinically meaningful prognostic information.
- First study to integrate blinded qualitative radiologic adjudication with quantitative CT scoring in myositis-ILD. By combining dual-radiologist review with Kazerooni quantitative scoring and longitudinal pulmonary function testing, this study offers a rigorous and multidimensional assessment of treatment response.
- Expands risk stratification beyond antibody-based toward imaging-based phenotyping strategies. In a heterogeneous population defined by diverse myositis-associated antibodies, this work introduces radiologic pattern as a practical and accessible framework for anticipating treatment responsiveness.
- Provides hypothesis-generating data for precision management in myositis-ILD. The findings support the concept that imaging-defined subgroups may exhibit differential therapeutic trajectories, laying groundwork for future multicenter studies integrating CT phenotype, antibody profile, and treatment strategy.
Sahin, S.; Diaz, E.; Rajagopal, A.; Abtahi, M.; Jones, S.; Dai, Q.; Kramer, S.; Wang, Z.; Larson, P. E. Z.
Current standard-of-care imaging practices cannot reliably differentiate among certain renal tumors, such as benign oncocytoma and clear cell renal cell carcinoma (RCC), or between low- and high-grade RCCs. Previous work has explored using deep learning, radiomics, and texture analysis to predict renal tumor subtypes and differentiate between low- and high-grade RCCs, with mixed success. To further this work, large, diverse datasets are needed to improve model performance and provide strong evaluation sets. In this work, a dataset of 831 multi-phase 3D CT exams was curated. Each exam contains up to three contrast-enhanced CT phases. Tumor outlines or bounding boxes were annotated and registered to the image volumes. The pathology results for each tumor and relevant patient metadata are also included.
Kästingschäfer, K. F.; Fink, A.; Rau, S.; Reisert, M.; Kellner, E.; Nolde, J. M.; Kottgen, A.; Sekula, P.; Bamberg, F.; Russe, M. F.
Rationale and Objectives: Contrast-enhanced (CE) MRI provides clear corticomedullary contrast for renal compartment delineation but may be contraindicated or undesirable in routine practice. We aimed to enable automated extraction of renal imaging biomarkers from routine non-contrast-enhanced (NCE) T1-weighted MRI by transferring CE-derived compartment labels. Materials and Methods: This retrospective single-center study (January 2017 to December 2021) included 200 participants with paired arterial-phase CE and NCE T1-weighted MRI. Cortex, medulla, and sinus were manually segmented on CE MRI and rigidly transferred to NCE MRI to provide voxel-level reference labels. A hierarchical 3D Deep Neural Patchworks model was trained on 100 examinations (90 training/10 validation) and evaluated on an independent test set of 100 examinations using the transferred CE masks on NCE as reference. Segmentation performance was assessed using Dice similarity, and biomarker agreement was assessed using volumes and surface areas (Pearson/Spearman, MAE, Lin's CCC, and Bland-Altman). Results: Whole-kidney segmentation Dice was 0.950 (left) and 0.953 (right). Total kidney volume showed high agreement with minimal bias (MAE 8.76 mL, 2.5% of mean; CCC 0.983; bias -1.56 mL; 95% limits of agreement -28.81 to 25.69 mL). Cortex volume was modestly overestimated and medulla volume underestimated, shifting predicted compartment fractions toward cortex (74.7% vs. 72.1% in ground truth; medulla 21.5% vs. 24.3%; sinus 3.8% vs. 3.6%). Sinus volume maintained high concordance despite higher Dice dispersion. Surface area was systematically underestimated with low concordance. Conclusion: CE-supervised knowledge transfer enables accurate, well-calibrated kidney volumetry from routine NCE MRI and supports contrast-free renal biomarker extraction. Surface area estimation remains challenging.
Take-home Messages
- CE-supervised label transfer enables accurate, well-calibrated contrast-free kidney volumetry on routine non-contrast T1-weighted MRI.
- Compartment volumetry is feasible but shows systematic cortex overestimation and medulla underestimation; surface area remains non-interchangeable due to boundary uncertainty.
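The agreement statistics named in this abstract (MAE, Bland-Altman bias with 95% limits of agreement, and Lin's CCC) can be sketched as below. This is a minimal illustration based on the standard textbook definitions, not the authors' code; the function names are hypothetical, and the abstract does not state which variance convention was used for the CCC (the population form is assumed here).

```python
from statistics import mean, stdev

def mae(pred, ref):
    """Mean absolute error between predicted and reference values."""
    return mean(abs(p - r) for p, r in zip(pred, ref))

def bland_altman(pred, ref):
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    diffs = [p - r for p, r in zip(pred, ref)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def lins_ccc(pred, ref):
    """Lin's concordance correlation coefficient (population variances assumed)."""
    n = len(pred)
    mp, mr = mean(pred), mean(ref)
    var_p = sum((p - mp) ** 2 for p in pred) / n
    var_r = sum((r - mr) ** 2 for r in ref) / n
    cov = sum((p - mp) * (r - mr) for p, r in zip(pred, ref)) / n
    return 2 * cov / (var_p + var_r + (mp - mr) ** 2)
```

Unlike Pearson correlation, the CCC penalizes a constant offset between the two measurements, which is why it is typically reported alongside the Bland-Altman bias when checking whether two volumetry methods are interchangeable.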
Khan, U.; Shah, S.; Luna-Victoria, G.; Groves, L.; Ramos, D.; Sirota, M.; Oskotsky, T.
Objective: To retrospectively validate an electronic health record (EHR) implementation of the patient-initiated PreMA screener and compare its association with severe maternal morbidity (SMM) outcomes against established obstetric comorbidity indices. Methods: We conducted a retrospective observational study using UCSF (single-center) and UC-wide (multi-center) de-identified EHR data, identifying live-birth deliveries with documented preconception data. PreMA and established comorbidity index (Bateman and Leonard) scores were computed from preconception diagnoses, standardized to z-scores, and modeled as continuous predictors of SMM and non-transfusion SMM (NT-SMM) using logistic and Poisson regression models, with stratified analyses by race, ethnicity, and neighborhood deprivation. To examine the relationship between individual PreMA questionnaire domains and outcomes, we used adjusted Poisson regression to estimate the association of each domain with SMM and NT-SMM. Results: Across both cohorts, higher standardized PreMA, Bateman, and Leonard scores were consistently and significantly associated with increased risk of SMM and NT-SMM, with relative risk estimates generally in the ~1.2-1.4 range per standard deviation (adj. p < 0.001) and of similar magnitude across indices and cohorts. Significant associations persisted across racial, ethnic, and socioeconomic strata, and item-level analyses suggested heterogeneity across PreMA domains, with cardiovascular domains showing the strongest adjusted associations. Conclusion: An EHR-derived PreMA score demonstrated robust, generalizable associations with severe maternal morbidity outcomes comparable to established clinician-facing indices, supporting PreMA's validity as a scalable, patient-centered preconception risk assessment tool.
Tong, T.; Zhang, W.; Zu, W.
Accurate polyp segmentation from colonoscopy images is critical for colorectal cancer prevention, yet the generalization of deep learning models under domain shift remains insufficiently explored. We propose Boundary-Explicit Guided Attention U-Net (BEGA-UNet), a boundary-aware segmentation architecture that introduces explicit edge modeling as a structural inductive bias to enhance both segmentation accuracy and cross-domain robustness. The framework integrates three components: an Edge-Guided Module (EGM) with learnable Sobel-initialized operators to capture boundary cues, a Dual-Path Attention (DPA) module that processes channel and spatial attention in parallel, and a Multi-Scale Feature Aggregation (MSFA) module to encode contextual information across multiple receptive fields. Evaluated on the combined Kvasir-SEG and CVC-ClinicDB benchmarks, BEGA-UNet achieves 88.53% Dice and 82.51% IoU, outperforming representative convolutional and transformer-based baselines. More importantly, cross-dataset evaluation demonstrates strong robustness under domain shift, with BEGA-UNet retaining 83.2% of its in-distribution performance, substantially higher than U-Net (64.5%), Attention U-Net (47.5%), and TransUNet (53.1%). In a zero-shot setting on an entirely unseen dataset, the model further maintains 72.6% performance retention. Comprehensive ablation studies indicate that explicit boundary modeling plays a central role in improving generalization, while multi-scale context aggregation further stabilizes performance across domains. Feature distribution analyses support this observation by showing that edge-oriented representations exhibit markedly reduced cross-domain variability compared to appearance-driven features. Overall, BEGA-UNet provides an effective and interpretable solution for robust polyp segmentation, demonstrating that explicit boundary modeling serves as a critical inductive bias for ensuring reliability under clinical domain shifts.
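For readers unfamiliar with the metrics quoted above, the following minimal sketch computes Dice and IoU on binary masks and a "performance retention" percentage. The abstract does not define retention; it is assumed here to be the ratio of the cross-domain score to the in-distribution score. The function names and the empty-mask convention are likewise assumptions for illustration, not the authors' implementation.

```python
def dice_and_iou(pred, target):
    """Dice and IoU for two flat binary masks (sequences of 0/1)."""
    inter = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    pred_sum = sum(pred)
    target_sum = sum(target)
    if pred_sum + target_sum == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + target_sum)
    iou = inter / (pred_sum + target_sum - inter)
    return dice, iou

def retention_percent(cross_domain_score, in_distribution_score):
    """Assumed definition: cross-domain score as a percentage of in-distribution score."""
    return 100.0 * cross_domain_score / in_distribution_score
```

Under this assumed definition, a model scoring 0.60 Dice on an unseen dataset and 0.80 in distribution would retain 75% of its performance.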
Cassim, N.; Stevens, W. S.; Glencross, D. K.; Coerzee, L.-M.
Background: In 2004, South Africa's public health system faced the dual challenge of rapidly scaling up antiretroviral therapy (ART) while reducing the cost of laboratory monitoring. At the time, conventional CD4 testing methods were expensive, labour-intensive, and impractical for sustaining a national testing network. This study aimed to assess the financial impact and cost savings associated with the implementation of the PanLeucogated CD4 (PLG/CD4) enumeration method between 2004 and 2024 in the public sector in South Africa. Methods: A longitudinal cost analysis was conducted using annual test volumes and state tariffs for PLG/CD4 testing and the 4-colour CD3/CD4/CD8/CD45 T-cell enumeration reference method. The state prices for the PLG/CD4 and reference method tariff codes were provided by calendar year in South African Rand (ZAR) and converted to United States Dollars (USD) at the prevailing historical exchange rate. The USD test prices were multiplied by annual test volumes. Annual cost savings were calculated by multiplying annual test volumes by the difference in USD test price between PLG/CD4 and the reference method. Results: There were 50,745,848 PLG/CD4 tests performed over 20 years. The cost per test of PLG/CD4 was consistently lower than that of the reference method, ranging from $4.06 to $9.40 compared with $13.06 to $28.21. Cumulative national savings amounted to USD 626 million. The peak annual savings of $64.6 million occurred in 2011, coinciding with the height of ART enrolment. Cost savings persisted despite a doubling in the exchange rate over the study period. Conclusion: The PLG/CD4 implementation enabled cost-efficient, scalable, quality-assured CD4 testing as part of the national HIV response, reducing reliance on complex/costly technologies while improving coverage.
These findings support the critical role of context-specific diagnostic innovation to strengthen health system resilience.
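The savings calculation described in the Methods (convert ZAR tariffs to USD at the year's exchange rate, then multiply the per-test price difference by the annual volume) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function names, field names, and any numbers passed in are hypothetical placeholders rather than study data.

```python
def annual_saving_usd(volume, plg_price_zar, ref_price_zar, zar_per_usd):
    """Saving for one calendar year, in USD.

    volume        -- number of PLG/CD4 tests performed that year
    plg_price_zar -- state tariff per PLG/CD4 test (ZAR)
    ref_price_zar -- state tariff per reference-method test (ZAR)
    zar_per_usd   -- prevailing exchange rate for that year
    """
    plg_price_usd = plg_price_zar / zar_per_usd
    ref_price_usd = ref_price_zar / zar_per_usd
    return volume * (ref_price_usd - plg_price_usd)

def cumulative_saving_usd(years):
    """Sum annual savings over an iterable of per-year dicts."""
    return sum(annual_saving_usd(**year) for year in years)
```

Because each year is converted at its own rate, a weakening rand lowers the USD value of later savings even when the ZAR price gap widens, consistent with the abstract's remark about the exchange rate.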